AI029
Reinforcement Learning: An Introduction
Multi-step Bootstrapping and Eligibility Traces
Learning Objectives
- Define the n-step return and its role in unifying TD(0) and Monte Carlo methods.
- Evaluate the bias-variance tradeoff inherent in n-step bootstrapping.
- Explain the forward and backward views of eligibility traces.
- Implement the TD(lambda) algorithm and explain how the lambda parameter interpolates between TD(0) (lambda = 0) and Monte Carlo (lambda = 1).